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Abstract 

Modeling of laser-plasma wakefield accelerators in an optimal frame of reference IH is shown to produce orders of 
magnitude speed-up of calculations from first principles. Obtaining these speedups requires mitigation of a high- 
frequency instability that otherwise limits eff'ectiveness in addition to solutions for handling data input and output 
in a relativistically boosted frame of reference. The observed high-frequency instability is mitigated using methods 
including an electromagnetic solver with tunable coefficients, its extension to accomodate Perfectly Matched Layers 
and Friedman's damping algorithms, as well as an efficient large bandwidth digital filter. It is shown that choosing 
the frame of the wake as the frame of reference allows for higher levels of filtering and damping than is possible in 
other frames for the same accuracy. Detailed testing also revealed serendipitously the existence of a singular time step 
at which the instability level is minimized, independently of numerical dispersion, thus indicating that the observed 
instability may not be due primarily to Numerical Cerenkov as has been conjectured. The techniques developed 
for Cerenkov mitigation prove nonetheless to be very efficient at controlling the instability. Using these techniques, 
agreement at the percentage level is demonstrated between simulations using diff'erent frames of reference, with 
speedups reaching two orders of magnitude for a 0.1 GeV class stages. The method then allows direct and efficient 
full-scale modeling of deeply depleted laser-plasma stages of 10 GeV-1 TeV for the first time, verifying the scaling of 
plasma accelerators to very high energies. Over 4, 5 and 6 orders of magnitude speedup is achieved for the modeling 
of 10 GeV, 100 GeV and 1 TeV class stages, respectively. 

Keywords: laser wakefield acceleration, particle-in-cell, plasma simulation, special relativity, frame of reference, 
boosted frame 
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1. Introduction 

Laser driven plasma waves off'er orders of magnitude increases in accelerating gradient over standard accelerating 
structures 1 2 1 (which are limited by electrical breakdown), thus holding the promise of much shorter particle accelera- 
tors (31. High quality electron beams of energy up-to 1 GeV have been produced in just a few centimeters ||4][5]|6l|3, 
with 10 GeV stages being planned as modules of a high energy collider | 8 |. 

As the laser propagates through a plasma, it displaces electrons while ions remain essentially static, creating a 
pocket of positive charges that the displaced electrons rush to fill. The resulting coherent periodic motion of the 
electrons oscillating around their original position creates a wake with periodic structure following the laser. The 
alternate concentration of positive and negative charges in the wake creates very intense electric fields. An electron (or 
positron) beam injected with the right phase can be accelerated by those fields to high energy in a much shorter distance 
than is possible in conventional particle accelerators. The efficiency and quality of the acceleration is governed by 
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several factors which require precise three-dimensional shaping of the plasma column, as well as the laser and particle 
beams, and understanding of their evolution. 

Computer simulations have had a profound impact on the design and understanding of past and present exper- 
iments 191, with accurate modeling of the wake formation and beam acceleration requiring fully kinetic methods 
(usually Particle-In-Cell) with large computational resources due to the wide range of space and time scales involved 
|[TOl[TTl. For example, modeling 10 GeV stages for the LOASIS (LBNL) BELLA proposal L12J demanded as many 
as 5,000 processor hours for a one-dimension simulation on a NERSC supercomputer |[T3l . Various reduced models 
have been developed to allow multidimensional simulations at manageable computational costs: fluid approximation 
(141, quasistatic approximation LiSJUSIIlZj, laser envelope models [16], scaled parameters |[T8j[T9l. However, the 
various approximations that they require result in a narrower range of applicability. As a result, even using several 
models concurrently does not usually provide a complete description. For example, scaled simulations of 10 GeV 
LPA stages do not capture correctly some essential transverse physics, e.g. the laser and beam betatron motion, which 
can lead to inaccurate beam emittance (a measure of the beam quality). An envelope description can capture these 
effects correctly at full scale for the early propagation through the plasma but can fail as the laser spectrum broadens 
due to energy depletion as it propagates further in the plasma. 

An alternative approach allows for orders of magnitude speedup of simulations, whether at full or reduced scale, 
via the proper choice of a reference frame moving near the speed of light in the direction of the laser 1 1 1. It does 
so without alteration to the fundamental equations of particle motion or electrodynamics, provided that the high- 
frequency part of the light emitted counter to the direction of propagation of the beam can be neglected. This approach 
exploits the properties of space and time dilation and contraction associated with the Lorentz transformation. As 
shown in 1 1 1, the ratio of longest to shortest space and time scales of a system of two or more components crossing at 
relativistic velocities is not invariant under such a transformation (a laser crossing a plasma is just such a relativistic 
crossing). Since the number of computer operations (e.g., time steps), for simulations based on formulations from first 
principles, is proportional to the ratio of the longest to shortest time scale of interest, it follows that such simulations 
will eventually have diff'erent computer runtimes, yet equivalent accuracy, depending solely upon the choice of frame 
of reference. 

The procedure appears straightforward: identify the frame of reference which will minimize the range of space 
and/or time scales and perform the calculation in this frame. However, several practical complications arise. First, 
the input and output data are usually known from, or compared to, experimental data. Thus, calculating in a frame 
other than the laboratory entails transformations of the data between the calculation frame and the laboratory frame. 
Second, while the fundamental equations of electrodynamics and particle motion are written in a covariant form, the 
numerical algorithms that are derived from them may not retain this property, and calculations in frames moving at 
diff'erent velocities may not be successfully conducted with the use of the exact same algorithms. For example, it 
was shown in l|2Qll that calculating the propagation of ultra-relativistic charged particle beams in an accelerator using 
standard Particle-In-Cell techniques lead to large numerical errors, which were fixed by developing a new particle 
pusher. The modeling of a laser plasma accelerator (LPA) stage in a boosted frame involves the fully electromagnetic 
modeling of a plasma propagating at near the speed of light, for which Numerical Cerenkov ||2T1 [23 is a potential 
issue. Third, electromagnetic calculations that include wave propagation will include waves propagating forward and 
backward in any direction. For a frame of reference moving in the direction of the accelerated beam (or equivalently 
the wake of the laser), waves emitted by the plasma in the forward direction expand while the ones emitted in the 
backward direction contract, following the properties of the Lorentz transformation. If one is to resolve both forward 
and backward propagating waves emitted from the plasma, there is no gain in selecting a frame diff'erent from the 
laboratory frame. However, the physics of interest for a laser wakefield is the laser driving the wake, the wake, and 
the accelerated beam. Backscatter is weak in the short-pulse regime, and does not interact as strongly with the beam 
as do the forward propagating waves which stay in phase for a long period. It is thus often assumed that the backward 
propagating waves can be neglected in the modeling of LPA stages. The accuracy of this assumption is shown by 
comparison between explicit codes which include both forward and backward waves and envelope or quasistatic 
codes which neglect backward waves 110, l'9l[23l . 

After the idea and basic scaling for performing simulations of LPA in a Lorentz boosted frame were published 
in m, there have been several reports of the application of the technique to various regimes of LPA FT3l l24l l25l [26l 
[TTl [27l i28J . Speedups varying between several to a few thousands were reported with various levels of accuracy in 
agreement between simulations performed in a Lorentz boosted frames and in a laboratory frame. High-frequency 
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instabilities were reported to develop in 2D or 3D calculations, that were limiting the velocity of the boosted frame 
and thus the attainable speedup [29l [27ll28l . 

In this paper, we present numerical techniques that were implemented in the Particle-In-Cell code Warp | 30| 
for mitigating the numerical Cerenkov instability, including a solver with tunable coefficients, and show that these 
techniques are effective for suppressing the high frequency instability observed in boosted frame simulations. A 
detailed study of the application of these techniques to the simulations of downscaled LPA stages reveals that choosing 
the frame of the wakefield as the reference frame allows for more aggressive application of the standard techniques 
mitigating numerical Cerenkov, than is possible in laboratory frame simulations. It is shown that the instability that 
develops with high-boost frames is well controlled, allowing for the first time 2D and 3D simulations of LPA in the 
wakefield frame, for 100 GeV and 1 TeV class stages, achieving the maximum theoretical speedups of over 10^ and 
10^ respectively. 

This paper is organized as follows. The theoretical speedup expected for performing the modeling of a LPA stage 
in a boosted frame is derived in Section 2. Section 3 addresses the issue of input and output of data in a boosted frame. 
High frequency instability issues and remedies are presented in Section 4. These techniques enable accurate modeling 
of 0.1 GeV-1 TeV LPA stages. Stage modeling results are presented in section 5, and observed speedup is contrasted 
to the theoretical speedup of section 2. 



2. Theoretical speedup dependency with the frame boost 

The obtainable speedup is derived as an extension of the formula that was derived in d, taking in addition into 
account the group velocity of the laser as it traverses the plasma. In 1 1 1, the laser was assumed to propagate at the 
velocity of light in vacuum during the entire process, which is a good approximation when the relativistic factor of 
the frame boost y is small compared to the relativistic factor of the laser wake y^^ in the plasma. The expression is 
generalized here to higher values of y, for which the actual group velocity of the wake in the plasma must be taken 
into account. We shall show that for a 10 GeV class LPA stage, the maximum attainable speedup is above four orders 
of magnitude. 

Assuming that the simulation box is a fixed number of plasma periods long, which implies the use (which is 
standard) of a moving window following the wake and accelerated beam, the speedup is given by the ratio of the time 
taken by the laser pulse and the plasma to cross each other, divided by the shortest time scale of interest, that is the 
laser period. Assuming for simplicity that the wake propagates at the group velocity of plane waves in a uniform 
plasma of density rie, the group velocity of the wake is given by 



v^/c=/3^ = ^l + ^j (1) 

where ojp = ^J (nge^) / (eorue) is the plasma frequency, o) = IncjA is the laser frequency, 6o is the permittivity of 
vacuum, c is the speed of light in vacuum, and e and are respectively the charge and mass of the electron. 

In the simulations presented herein, the runs are stopped when the last electron beam macro-particle exits the 
plasma, and a measure of the total time of the simulation is given by 

T = '-^ (2) 

where Ap ^ Inc/ojp is the wake wavelength, L is the plasma length, and Vp = /3pC are respectively the velocity 
of the wake and of the plasma relative to the frame of reference, and t] is an adjustable parameter for taking into 
account the fraction of the wake which exited the plasma at the end of the simulation. The numerical cost Rt scales as 
the ratio of the total time to the shortest timescale of interest, which is the inverse of the laser frequency, and is thus 
given by 

Tc + r]Ap) 
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In the laboratory, = and the expression simphfies to 
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where y = 1 / ^/T^-0^. 

The expected speedup from performing the simulation in a boosted frame is given by the ratio of Riat and R* 
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Assuming that y « y„, and that pw ~ I (which is a valid approximation for most practical cases of interest), this 
expression is consistent with the expression derived in 1 1 1 for the LPA case which states that R*^ = aR,/ (1 +yS) with 
a = (I -yS + l/L) / (1 + l/L), where / is the laser length which is generally proportional to rjAp, and S = Rt/Rj. 

The linear theory predicts that for the intense lasers (a>l) typically used for acceleration, the laser depletes 
its energy over approximately the same length = A^,/2A^ over which the particles dephase from the wake |2|. 
Acceleration is compromised beyond and in practice, the plasma length is proportional to the dephasing length, i.e. 



L = ^Ld- In most cases, » 1, which allows the approximations p„ ^ 1- A^/2A^, and L = ^A?pl2A? ~ ^^■^ 



TjAp, so that Eq.( 12 ( becomes 



2, ,2 
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(13) 



For low values of y, i.e. when y « y„, Eq.( 13 ( reduces to 



=(l+;8)2y2 



(14) 



Conversely, if y ^ oo, Eq.( 13 ( becomes 



Finally, in the frame of the wake, i.e. when y = y„„ assuming that jS„, 

2 _ 2 



1, Eq.( 13 ( gives 



(15) 



(16) 



1+2;,/^' 

Since 77 and ^ are of order unity, and the practical regimes of most interest satisfy 7^ » 1, the speedup that is 
obtained by using the frame of the wake will be near the maximum obtainable value given by Eq.(p3]). 

Note that without the use of a moving window, the relativistic effects that are at play in the time domain would 
also be at play in the spatial domain, as shown in |T|, and the scaling would transform to y^. In the frame of the 
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wake, there is no need of the moving window, thus simplifying the procedure, while in a frame traveling faster than 
the wake in the laboratory, a moving window propagating in the backward direction is needed. However, the scaling 
shows that there would be very little gain in doing the latter. 



2.7. Estimated speedup for 0.1-100 GeV stages 



Formula ( 13 ) is used to estimate the speedup for the calculations of 100 MeV, 1 GeV, 10 GeV and 100 GeV class 
stages, assuming a laser wavelength A = O.Syum. Using parameters and scaling laws from |18|, the corresponding 
initial plasma densities Ue are respectively lO^^cc, lO^^cc, lO^^cc and lO^^cc, while the plasma lengths L are 1.5 mm, 
4.74 cm, 1.5 m, and 47.4 m, with ^ ^ 1.63. For these values, the wake wavelengths Ap are respectively 10.6//m, 
33.4/im, 106.yL/m, 334.yum, and relativistic factors y^^ are 13.2, 41.7, 132 and 417. In the simulations presented in this 
paper, the beam is injected near the end of the wake period (first "bucket"). In first approximation, the beam has 
propagated through about half a wake period to reach full acceleration, and we set t] ^ 0.5. For a beam injected into 
the n^^ bucket, t] would be set to ^ - 1 /2. If positrons were considered, they would be injected half a wake period 
ahead of the location of the electrons injection position for a given period, and one would have ?] = n - 1. For the 



parameters considered here, L ; 
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Figure 1: Speedup versus relativistic factor of the boosted frame from Eq.fB) for 100 MeV - 100 GeV LPA class stages. 

The speedup versus the relativistic factor of the boosted frame y is plotted in Fig. [T] As expected, for low values 
of 7, the speedup scales as ( 14 ), and asymptotes to a value slightly lower than 2^^ for large values of y. It is of interest 
to note that the qualitative behavior is identical to the one obtained in 1 1 1 (see Fig. 1 and accompanying analysis) in 
the analysis of the crossing of two rigid identical beams, confirming the generality of the generic analysis presented 
in yj. For a 100 GeV class stage, the maximum estimated speedup is as large as 300,000. 



3. Input and output to and from a boosted frame simulation 

This section describes the procedures that have been implemented in the Particle-In-Cell framework Warp |[30| to 
handle the input and output of data between the frame of calculation and the laboratory frame. Simultaneity of events 
between two frames is valid only for a plane that is perpendicular to the relative motion of the frame. As a result, the 
input/output processes involve the input of data (particles or fields) through a plane, as well as output through a series 
of planes, all of which are perpendicular to the direction of the relative velocity between the frame of calculation and 
the other frame of choice. 
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3.1. Input 
3.1.1. Particles 

Particles are launched through a plane using a technique which applies to many calculations in a boosted frame, in- 
cluding LPA, and is illustrated using the case of a positively charged particle beam propagating through a background 
of cold electrons in an assumed continuous transverse focusing system, leading to a growing transverse instability 1 1 1. 
In the laboratory frame, the electron background is initially at rest and a moving window is used to follow the beam 
progression. Traditionally, the beam macroparticles are initialized all at once in the window, while background elec- 
tron macroparticles are created continuously in front of the beam on a plane that is perpendicular to the beam velocity. 
In a frame moving at some fraction of the beam velocity in the laboratory frame, the beam initial conditions at a given 
time in the calculation frame are generally unknown and one must initialize the beam differently. However, it can be 
taken advantage of the fact that the beam initial conditions are often known for a given plane in the laboratory, either 
directly, or via simple calculation or projection from the conditions at a given time. Given the position and velocity 
{x, y, z, Vx^Vy^Vz) for each beam macroparticle at time ^ = for a beam moving at the average velocity = PbC (where 
c is the speed of light) in the laboratory, and using the standard synchronization = = at ^ = = 0) between the 
laboratory and the calculation frames, the procedure for transforming the beam quantities for injection in a boosted 
frame moving at velocity fie in the laboratory is as follows (the superscript ' relates to quantities known in the boosted 
frame while the superscript * relates to quantities that are know at a given longitudinal position z* but different times 
of arrival): 

1. project positions at z* = assuming ballistic propagation 

f = (z--z)/v, (17) 

X* = x-Vxt* (18) 

/ = y-vyf (19) 

z* = (20) 

the velocity components being left unchanged, 

2. apply Lorentz transformation from laboratory frame to boosted frame 



= -yt* (21) 

x'* = jc* (22) 

/* = f (23) 

z* = y/Sct' (24) 



7(1 -m) 



(25) 



v; = -^T^TTT (26) 



7(1 -m 



(27) 



where 7=1/ ^Jl -JS^- With the knowledge of the time at which each beam macroparticle crosses the plane into 
consideration, one can inject each beam macroparticle in the simulation at the appropriate location and time. 
3. synchronize macroparticles in boosted frame, obtaining their positions at a fixed t'(= 0) which is before any 
particle is injected 



= z'*-v;*r (28) 



This additional step is needed for setting the electrostatic or electromagnetic fields at the plane of injection. 
In a Particle-In-Cell code, the three-dimensional fields are calculated by solving the Maxwell equations (or 
static approximation like Poisson, Darwin or other 11201 ) on a grid on which the source term is obtained from 
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the macroparticles distribution. This requires generation of a three-dimensional representation of the beam 
distribution of macroparticles at a given time before they cross the injection plane at z'*. This is accomplished 
by expanding the beam distribution longitudinally such that all macroparticles (so far known at different times 
of arrival at the injection plane) are synchronized to the same time in the boosted frame. To keep the beam 
shape constant, the particles are "frozen" until they cross that plane: the three velocity components and the 
two position components perpendicular to the boosted frame velocity are fixed, while the remaining position 
component is advanced at the average beam velocity. As particles cross the plane of injection, they become 
regular "active" particles with full 6-D dynamics. 




2' (m) 

Figure 2: (top) Snapshot of a particle beam "frozen" (grey spheres) and "active" (colored spheres) macroparticles traversing the injection plane (red 
rectangle), (bottom) Snapshot of the beam macroparticles (colored spheres) passing through the background of electrons (dark brown streamlines) 
and the diagnostic stations (red rectangles). The electrons, the injection plane and the diagnostic stations are fixed in the laboratory plane, and are 
thus counterpropagating to the beam in a boosted frame. 

Figure [2] (top) shows a snapshot of a beam that has passed partly through the injection plane. As the frozen beam 
macroparticles pass through the injection plane (which moves opposite to the beam in the boosted frame), they are 
converted to "active" macroparticles. The charge or current density is accumulated from the active and the frozen 
particles, thus ensuring that the fields at the plane of injection are consistent. 

3.1.2. Laser 

Similarly to the particle beam, the laser is injected through a plane perpendicular to the axis of propagation of the 
laser (by default z). The electric field E_i_ that is to be emitted is given by the formula 

E± (x, y, t) = Eq/ (x, y, t) sin [cot + (x, y, co)] (29) 

where Eo is the amplitude of the laser electric field, / (x, y, t) is the laser envelope, oj is the laser frequency, (x, y, cS) 
is a phase function to account for focusing, defocusing or injection at an angle, and t is time. By default, the laser 
envelope is a three dimensional gaussian of the form 

/ (X, J, t) = ^-(-V2.^//2.^c^.V2.9 (30) 

where cTx, o-y and cr^ are the dimensions of the laser pulse; or it can be defined arbitrarily by the user at runtime. If 
(x, y,(jS) = \, the laser is injected at a waist and parallel to the axis z. 
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If, for convenience, the injection plane is moving at constant velocity PsC, the formula is modified to take the 
Doppler eff'ect on frequency and amplitude into account and becomes 

E^ix^yj) = (1 -yS,)£o/fej,Osin[(l - Ps) cot + ^{x^y^co)] . (31) 

The injection of a laser of frequency co is considered for a simulation using a boosted frame moving at fie with 
respect to the laboratory. Assuming that the laser is injected at a plane that is fixed in the laboratory, and thus moving 
at Ps = -p in the boosted frame, the injection in the boosted frame is given by 

E^{x\y\f) = (l-/3s)E'^f(x\y\f)sm[(l-/3s)oj'f^(P(x\y\oj')] (32) 
= (Eo/y) f {x\y\ f) sin [ojf /y + (/, /, oj')] (33) 

since E^Eq = oj'/oj = 1/ (1 + p)y. 

The electric field is then converted into currents that get injected via two dual 2-D arrays of macro-particles, with 
one positive and one negative macro-particle per cell in the plane of injection, whose weights and motion are governed 
by E_^ (x\y\ f). Injecting using these dual arrays of macroparticles off'er the advantages of automatically including 
the longitudinal component which arise from emitting into a boosted frame, and to verify the discrete Gauss law 
thanks to the use of the Esirkepov current deposition scheme |31 1. 

The technique implemented in Warp presents several advantage over other procedures that have been proposed 
elsewhere |23l|28l. In |28|, the laser beam is initialized entirely in the computational box, leading to larger boxes 
transversely in a boosted frame, as the Rayleigh length of the laser shortens and the overall laser pulse radius rises, 
eventually off'setting the benefits of the boosted frame. The transverse broadening of the box is avoided in 1 13 1 at the 
cost of a more complicated injection scheme, requiring to launch the laser from all but one faces of the simulation 
box. The method presented here avoids the caveat of the broadening and retains simplicity with a standard injection 
technique through one plane. 

3.2. Output 

Some quantities, e.g. charge, are Lorentz invariant, while others, like dimensions perpendicular to the boost 
velocity, are the same in the laboratory frame. Those quantities are thus readily available from standard diagnostics 
in the boosted frame calculations. Quantities which do not fall in this category are recorded at a number of regularly 
spaced "stations", immobile in the laboratory frame, at a succession of time intervals to record data history, or averaged 
over time. A visual example is given on Fig. [2] (bottom). Since the space-time locations of the diagnostic grids in the 
laboratory frame generally do not coincide with the space-time positions of the macroparticles and grid nodes used for 
the calculation in a boosted frame, some interpolation is performed at runtime during the data collection process. As 
a complement or an alternative, selected particle or field quantities are dumped at regular interval for post-processing. 
The choice of the methods depends on the requirements of the diagnostics and particular implementations. 

4. High frequency instability and Numerical Cerenkov 

As reported in 1271 and 1281 , for simulations using a boosted frame at y > 10-20 (depending on parameters), 
a fast growing short wavelength instability was observed developing at the front of the plasma (see Fig. [3]). The 
presence and growth rate of the instability was observed to be very sensitive to the resolution (slower growth rate at 
higher resolution), choice of field solver, and to the amount of damping of high frequencies and smoothing of short 
wavelengths. The instability is always propagating at some angle from the longitudinal axis, and is observed in 2D 
and 3D runs but was never observed in any of the ID runs performed by the authors. When modeling an LPA setup in 
a relativistically boosted frame, the background plasma is traveling near the speed of light and it has been conjectured 
1281 that he observed instability might be caused by numerical Cerenkov. We investigate in this paper whether the 
instability that is observed in boosted frame simulations of LPA is indeed of numerical Cerenkov type and if the cures 
aimed at mitigating numerical Cerenkov are effective. 

Due to spatial and time discretization of the Maxwell equations, numerical light waves may travel faster or slower 
on the computational grid than the actual speed of light in vacuum c, with the magnitude of the eff'ect being larger at 
short wavelength, where discretization errors are the largest. When the numerical speed is lower than c, it is possible 
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Figure 3: Snapshot of a surface plot of the longitudinal field from a 2-1/2D simulation of a full scale lOGeV LPA in a boosted frame at 7 = 130 
(elevation is proportional to the magnitude of the electric field). The laser is propagating from left to right and the plasma from right to left. A fast 
growing short wavelength instability is developing at the front of the plasma. 



for fast macro-particles to travel faster than the wave modes, leading to numerical Cerenkov effects that may result 
in instabilities f2T] [22l |32j |33l |34l . The effect was studied analytically and numerically in detail for one-dimensional 
systems in [32 , Several solutions were proposed: smoothing the current deposited by the macro-particles |21 , 32], 
damping the electromagnetic field |[34l|35]|36l, solving the Maxwell equations in Fourier space 12211 . or using a field 
solver with a larger stencil to provide lower numerical dispersion f34l. 

Several of the abovementioned techniques to mitigate numerical Cerenkov and high frequency errors have been 
implemented in Warp. All the simulations presented in this paper employed cubic splines for current deposition and 
electromagnetic force gathering between the macro-particles and the grid |37|, whose beneficial eff'ects on standard 
LPA PIC simulations have been demonstrated in (3811 . In addition, a Maxwell solver with tunable coefficients was 
implemented, as well as a damping scheme, and filtering of the deposited current and gathered electromagnetic fields, 
which are described in this section. The use of Fourier based Maxwell solvers is not considered in this paper. 

4.1. Wideband lowpass digital filtering 

It is common practice to apply digital filtering to the charge or current density in Particle-In-Cell simulations, for 
smoothing or compensation purpose [47 J . The most commonly used filter is the three points filter 

<l/.=acl>j + {l-a)'^^^^^^ (34) 

where 0^ is the filtered quantity. This filter is called a binomial filter when a = 0.5. Assuming = e^^^ and 
(p^ = g (of, k) e^^^, where g is the filter gain, which is function of the filtering coefficient a and the wavenumber we 
find from ^ that 

g{a,k) = a -\- (I - a) cos (kSx) (35) 
. (36) 

For n successive applications of filters of coefficients ai...an, the total attenuation G is given by 

n 

1=1 

If an = n - o;i then G ^ I -\- O (k^^^, providing a sharper cutoff' in k space. Such step is called a compensation 
step BtI . For the bilinear filter (a = 1/2), the compensation factor is = 2 - 1/2 = 3/2. For a succession of n 
applications of the bilinear factor, it is ac = n/2 -\- I. The gain versus wavelength is plotted in Fig. |4]for the bilinear 
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Figure 4: Gain versus wavelength for the bilinear filter without compensation (g = g(l/2,k)), with compensation (g • C3/2 = g(l/2,k) • g(3/2,k)), 
and n-pass bilinear filters with compensation (g" • c^^ = g(l/2, ky ■ giac,k)) for n = 4, 20, 50 and 80. 



filter without compensation (G = g(l/2,k)), with compensation (G = g(l/2,k) • g(3/2,k)), and four n-pass bilinear 
filters with compensation (G = g(l/2,ky • g(3/2,k)) for ^ = 4, 20, 50 and 80. 

The bilinear filter provides complete suppression of the signal at the grid Nyquist wavelength (twice the grid cell 
size). Suppression of the signal at integers of the Nyquist wavelength can be obtained by using a stride s in the filter 

(P^j = a(Pj + (1 - a) ^ (39) 

for which the gain is given by 

g(s,a,k) = a -\- (I - a) cos (skSx) (40) 

. l-(l-a)^-f^.0{k^) (41) 

The gain is plotted in Fig. |5](top) for four passes bilinear filters with compensation (G = g(s, 1 /2, ^)^ • g(s, 3/2, k)) 
for strides s=l to 4. For a given stride, the gain is given by the gain of the bilinear filter shifted in k space, with the 
pole ^ = shifted from A = 2/Sx to A = 2s /Sx, with additional poles, as given by 

skSx = arccos ( — ^— | (mod 2n) (42) 
\a - 1/ 

The resulting filter is pass band between the poles, but since the poles are spread at diff'erent integer values in k space, 
a wide band low pass filter can be constructed by combining filters at diff'erent strides. Examples are given in Fig. [5] 
(bottom) for combinations of the filter with stride 1 to 4. 

The combined filters with strides 2, 3 and 4 have nearly equivalent f ail-off s in gain (in linear scale) to the 20, 50 
and 80 passes of the bilinear filter (see Fig. |6]). Yet, the filters with stride need respectively 10, 15 and 15 passes of 
a three-point filter while the n-pass bilinear filer need respectively 21, 51 and 81 passes, giving gains of respectively 
2.1, 3.4 and 5.4 in number of operations in favor of the filters with stride. 
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Figure 5: (top) gain for four passes bilinear filters with compensation (G^ = gis, l/2,k)^ ■ g(s,3/2,k)) for strides s=l to 4 linear with (left) 
linear ordinate (right) logarithmic ordinate; (bottom) gain for four low pass filters combining the Gi to G4 filters with (left) linear ordinate (right) 
logarithmic ordinate. 
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4.2. Tunable solver 

In Il39ll and 1401 , Cole introduced an implementation of the source-free Maxwell's wave equations for narrow- 
band applications based on non-standard finite-differences (NSFD). In fTT], Karkkainen et al adapted it for wideband 
applications. At the Courant limit for the time step and for a given set of parameters, the stencil proposed in [41 1 
has no numerical dispersion along the principal axes, provided that the cell size is the same along each dimension 
(i.e. cubic cells in 3D). The solver from [41 1 was modified to be consistent with the Particle-In-Cell methodology and 
implemented in the code Warp, with the ability given to the user of setting the solver adjustable coefficients, providing 
tunability of the numerical properties of the solver to better fit the requirements of a particular application. 

The "Cole-Karkkainnen"'s solver fA\\ uses a non-standard finite diff'erence formulation (extended stencil) of the 
Maxwell-Ampere equation. For implementation into a Particle-In-Cell code, the formulation must introduce the 
source term into Cole-Karkkainen's source free formulation in a consistent manner. However, modifying the NSFD 
formulation of the Maxwell- Ampere equation so that it includes the source term in a way that is consistent with the cur- 
rent deposition scheme is challenging. To circumvent this problem. Warp implementation departs from Karkkainen's 
by applying the enlarged stencil on the Maxwell-Faraday equations, which does not contain any source term but 
is formally equivalent to the source-free Maxwell-Ampere equation. Consequently, in Warp's implementation, the 
discretized Maxwell- Ampere equation is the same as in the Yee scheme, and the discretized Maxwell's equations 
read: 



A^E 

V-E 
V* -B 



-V* xE 
c^V X B - 
P 

^0 





J 

^0 



(43) 
(44) 

(45) 
(46) 



where 6o is the permittivity of vacuum, and Eq. |45] and |46] not being solved explicitly but verified via appropriate 
initial conditions and current deposition procedure. The diff'erential operators are defined as 



V 

V* 

the finite diff'erences and sums operators being 

a: 



h^x + AyS + A^z 
A:x + A;y + A*z, 



h,j,k h,j,k 



5t 



G\l 



i+l/2,j,k 



U2J,k 



6x 



with 



^xG\ij^k ~ *^l;!i+l/2,*:+l/2 + G|(J-l/2,it+l/2 
G\1j^i/2,k-l/2 + G\"j-i/2,k-l/2 



(47) 
(48) 



(49) 

(50) 
(51) 



(52) 



(53) 



The quantity G is a sample vector component, 6t and 6x are respectively the time step and the grid cell size along 
X, while Q-.yS andy are constant scalars verifying a + 4(S + 4y = 1. The operators along j and z, i.e. Ay, A^, A*, A*, 5^, 
5 Sy, and S^, are obtained by circular permutation of the indices. 
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In 2D, assuming the plane (x, z), the enlarged finite operators simplify to 



l/2,k- 



(54) 
(55) 



An extension of this algorithm for non-cubic cells provided by Cowan in | 43 1 is not considered in this paper. How- 
ever, all considerations given here for the solver implemented in Warp apply readily to the solver developed by Cowan. 



4.2.7. Numerical dispersion 

The dispersion relation of the solver is given by 



sm 



codt 



cSt 



with 



sm 



k^Sx 



Sx 



/ . ky6y \ 

sm ^ 



Sy 



sm 



k,6z 



Sz 



Cy = a-\- 2p(Cz -h c^) -h 

= a-\- ipicx + Cy) + AyCxCy 



and 



Cr = 



Cv = 



C7 = 



COS (kxS^) 

COS (kySy^ 

COS (k^Sz) 



(56) 



(57) 
(58) 
(59) 



(60) 
(61) 
(62) 



The Courant-Friedrichs-Lewy condition (CFL) is given by 



cStc < min[^x, Sy, Sz, 

1/ - 47) max + Ky, Kx + /c^, Ky + /c^j, 
1/ ^{a - 4yS -h Ay) [k^ + Ky + k,)\ 



(63) 



where Kx = l/Sx^, Ky = l/Sy^ and = l/Sz^. 



Assuming cubic cells (Sx = Sy = Sz), the coefficients given in BTl (a = 7/12, ft = 1/12 and y = 1/48) allow 
cSt = Sx, and thus no dispersion along the principal axes. 
It is of interest to note that SSE^ can be rewritten 



• ojSt \2 



sm 



cSt 



{si + + ^^2^2 ^ ^2^2 ^ ^2^2^ ^ y {sls^sl) 



(64) 



with Sx = sm(kxSx/2), Sy = sin(^3;^3;/2^, s^ = sin (k^Sz/ 2), J3' = -8/5 - I67 and Y = 48^, for which the coefficients 
from BTIl take the nice values J3' = -I and Y = I. 

Sets of possible coefficients and the corresponding CFL condition, assuming cubic cells, are given in Table[T] The 
numerical dispersion using those coefficients are plotted in figure [7] along the principal axes and diagonals for cubic 
cells (Sx = Sy = Sz) and contrasted with the one of the Yee solver (all taken at each solver's CFL time step limit). At 
the CFL limit, the Yee algorithm off'ers no numerical dispersion along the 3D diagonal, but relatively large numerical 
dispersion at the Nyquist frequency along the main axes. Conversely, the Cole-Karkkainen solver (CK) off'ers no 
numerical dispersion along the main axes but significant dispersion along the diagonals. The CK solver also allows 
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Yee 


CK 


CK2 


CK3 


CK4 


CK5 







-1 


-1/2 





-1/2 


-9/10 


y' 





1 


1/2 


-1 





9/10 


a 


1 


7/12 


19/24 


11/12 


3/4 


5/8 







1/12 


1/24 


1/24 


1/16 


3/40 


7 





1/48 


1/96 


-1/48 





3/160 


c6t/6x 


1/V3 


1 


1/V2 


1/V2 


V2/V3 


V5/V6 



Table 1 : List of coefficients 



larger time steps than the Yee solver by almost a factor of two in 3D. The solver labeled "CK2" offers numerical 
dispersion that is intermediate between the Yee solver and the CK solver along the main axes and the 3D diagonal, but 
slightly degraded along the 2D diagonal. Conversely, while solver CK3 also offers intermediate numerical dispersion 
along the main axes and the 3D diagonal, it offers no numerical dispersion along the 2D diagonal. Solver CK4 
improves slightly the numerical dispersion along the main axes over CK2 and CK3 at the expense of the dispersion 
along the diagonals. Finally, CK5 offers the highest level of isotropy. The CFL time steps of solvers CK2, 3, 4 and 
5 are intermediate between the Yee and the CK CFL time steps. This provides solvers with a range of numerical 
dispersion among which some may be more favorable with regard to the mitigation of numerical instabilities for a 
given application. 

To reduce numerical dispersion to its lowest level, it is desirable to operate the CK solver as close as possible to 
the CFL limit cSt = Sx. However, an instability (other than numerical Cerenkov) arises at the Nyquist frequency in 
such a case. The analysis is given in ID in Appendix I, as well as its mitigation using digital filtering. Since for the 
CK solver, the CFL limit is independent of dimensionality, the analysis and mitigation apply readily to 2D and 3D 
simulations. 

For absorption of outgoing waves at the computational box boundaries, the extension of the solver to a Perfectly 
Matched Layer |44 | is given in Appendix IL 
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Figure 7: Numerical dispersion along the principal axis and diagonals 
adjustable numerical dispersion using the parameters from Table[2 



V2Jt V3Jt 

for cubic cells (Sx = Sy = Sz) at the Courant limit for the solver with 
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4.2.2. Current deposition and Gauss ' Law 

In most applications, it is essential to prevent accumulations of errors to the discretized Gauss' Law. This is 
accomplished by providing a method for depositing the current from the particles to the grid which is compatible with 
the discretized Gauss' Law, or by providing a mechanism for "divergence cleaning" ||47l|48l|49l|50l. For the former, 
schemes which allow a deposition of the current that is exact when combined with the Yee solver is given in | 51 1 for 
linear form factors and in | 31 1 for higher order form factors. Since the discretized Gauss' Law and Maxwell-Faraday 
equation are the same in our implementation as in the Yee solver, charge conservation is readily verified using the 
current deposition procedures from 1511 and (311 , and this was verified numerically. Hence divergence cleaning is not 
necessary. 



4.3. Friedman adjustable damping 

The tunable damping scheme developed by Friedman f36l was shown to be the most potent practical method for 
mitigating the numerical Cerenkov instability in |34|, among the selected methods that were considered. It is readily 



applicable to the solver presented above by modifying ( 43 ) to 



g«+3/2 ^ g«+l/2 _ ^^y* 



with 



2 



where < ^ < 1 is the damping factor. The numerical dispersion becomes 

oj6t \2 



Sin ■ 



cSt 



where 



and 



26/sin^ (ojSt/2) 



2e 



-ia)6t . 



sm 



Sx 



/ . kySy \2 

' sm ^ * 



Sy 



sm 



Sz 



The CFL is given by 



cStl = cStr 



2 + e 

2 + 36/ 



(65) 



(66) 



(67) 



(68) 



(69) 



(70) 



where dtc is the critical time step of the numerical scheme without damping {6 = 0), as given by ( |63) ). 

The numerical dispersion of the Cole-Karkkainen-Friedman (CKF) solver (using the coefficients from the CK 
solver in Table [T]) is plotted in figure [8] along the principal axis and diagonals for cubic cells (Sx = Sy = Sz) and 
contrasted with the one of the Yee-Friedman (YF) solver (both taken at the Courant time step limit). The amount 
of phase error rises with the value of the damping parameter 6 (partly due to the slightly more constraining limit on 
the critical time step). However, it was shown in |34| that the amount of damping provided by the YF solver was 
sufficient to counteract the slight degradation of numerical dispersion with raising 6, reducing the numerical Cerenkov 
eff'ects to an acceptable level for the problem that was considered. The damping is very isotropic with the CKF 
solver but not with the YF one. The YF implementation off'ers a higher level of damping of the shortest wavelengths 
along the 3D diagonals, while the CKF off'ers more damping along the axes, and the amount of damping along the 2D 
diagonals are similar. In summary, the YF implementation delivers respectively the highest/lowest level of damping in 
the direction of lowest/highest numerical dispersion, while the CKF implementation delivers a proportionally higher 
level of dispersion than the YF implementation along the direction of highest numerical dispersion. Thus it may be 
expected that the CKF implementation will be more efficient in reducing numerical Cerenkov eff'ects. 
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Figure 8: Numerical dispersion along the principal axis and diagonals for cubic cells (Sx = Sy = Sz) at the Courant limit for: (left) the Yee-Friedman 
solver; (right) the Cole-Karkkainen-Friedman solver. The real part (phase) and the imaginary part (amplitude) are plotted respectively in the top 
and bottom rows. 
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5. Application to the modeling of laser wakefield acceleration 

This section presents applications of the methods to the modehng of 10 GeV LPA stages at full scale in 2-1/2D 
and 3D, which has not been done fully self-consistently with other methods. It has been shown that many parameters 
of high energy LPA stages can be accurately simulated at reduced cost by simulating stages of lower energy gain, with 
higher density and shorter acceleration distance, by scaling the physical quantities relative to the plasma wavelength, 
and this has been applied to design of 10 GeV LPA stages (181 [191. The number of oscillations of a mismatched laser 
pulse in the plasma channel however depends on stage energy and does not scale, though this effect is minimized for 
a channel guided stage as considered in [18, 19 1. The number of betatron oscillations of the trapped electron bunch 
will also depend on the stage energy, and may affect quantities like the emittance of the beam. For these reasons, and 
to prove validity of scaled designs of other parameters, it is necessary to perform full scale simulations, which is only 
possible by using reduced models or simulations in the boosted frame. 

As a benchmarking exercise, we first perform scaled simulations similar to the ones performed in |18|, at a 
density of He = 10^^ cm"^, using various values of the boosted frame relativistic factor y to show the accuracy and 
convergence of the technique. These stages were shown to efficiently accelerate both electrons and positrons with low 
energy spread, and the scaled simulations predicted acceleration of hundreds of pC to 10 GeV energies using a 40 
J laser. The accuracy of the technique is evaluated by modeling scaled stages |T8l [191 at 0. 1 GeV, which allows for 
a detailed comparison of simulations using a reference frame ranging from the laboratory frame to the frame of the 
wake. Excellent agreement is obtained on wakefield histories on axis, beam average energy history and momentum 
spread at peak energy, with speedup over a hundred, in agreement with the theoretical estimates from Section 2. The 
downscaled simulations are also used for an in-depth exploration of the effects of the methods presented in Sections 3 
and 4, and evaluation of which techniques are required to permit maximum y boost while maintaining high accuracy. 
We then apply the boosted frame technique to provide full scale simulation of high efficiency quasilinear LPA stages 
at higher energy, verifying the scaling laws in the 10 GeV-1 TeV range. 

5. 1 . Scaled 10 GeV stages 

The parameters were chosen to be close (though not identical) to the case where kpL = 2 in fT8l| where kp is 
the plasma wavenumber and L is the laser pulse length. In the cases considered in this paper, the beam is composed 
of test particles only, with the goal of testing the fidelity of the algorithm in modeling laser propagation and wake 
generation. The results from simulations of LPA in a boosted frame where beam loading is present will be presented 
elsewhere. These simulations are scaled replicas of 10 GeV stages that would operate at actual densities of 10^^ cm"^ 
idSKlQ and allow short run times to permit effective benchmarking between the algorithms. The main physical and 
numerical parameters of the simulation are given in Table [2] Unless noted otherwise, in all the simulations presented 
herein, the field is gathered from the grid onto the particles directly from the Yee mesh locations, i.e. using the 'energy 
conserving' procedure (see BTl . chapter 10). 



5.1.1. Using standard numerical techniques 

These runs were done using the standard Yee solver with no damping, and with the 4-pass stride- 1 filter plus 
compensation, similarly to the simulations reported in [18 1 . No signs of detrimental numerical instabilities were 
observed at the resolutions reported here with these settings. 

The approximate relativistic factor of the wake that is formed by the laser traveling in the plasma is given, ac- 
cording to linear theory, by = IncjAcDp where cOp = ^neC^/eome is the electron plasma frequency. For the given 
parameters, y^^ ^ 13.2. Thus, Warp simulations were performed using reference frames moving between y = I (lab- 
oratory frame) and 13. For a boosted frame associated with a value of y approaching y^^ in the laboratory, the wake 
is expected to travel at low velocity in this boosted frame, and the physics to appear somewhat different from the one 
observed in the laboratory frame, in accordance to the properties of the Lorentz transformation. Figure [9] and \T0\ show 
surface renderings of the transverse and longitudinal electric fields respectively, as the beam enters its early stage of 
acceleration by the plasma wake, from a calculation in the laboratory frame and another in the frame at y = 13. The 
two snapshots offer strikingly different views of the same physical processes: in the laboratory frame, the wake is fully 
formed before the beam undergoes any significant acceleration and the imprint of the laser is clearly visible ahead of 
the wake; while in the boosted frame calculation, the beam is accelerated as the plasma wake develops, and the laser 
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Table 2: List of parameters for scaled lOGeV class LPA stage simulation. 



beam radius 


Rb 


82.5 nm 


beam length 


U 


85. nm 


beam transverse profile 




exp(-rV87??) 


UCctlll iUllglLUU-llldl piUlllC 




exp y Z 1 ^L^^ J 


laser wavelength 


A 


u.o yum 


laser lengtn (rWrlM) 


L 


iU.Uo yum 


normalized vector potential 


ao 


i 


laser longitudinal profile 




sin {ttz/ L) 


plasma density on axis 


Jig 


10^^ cm"^ 


plasma longitudinal profile 




flat 


plasma length 


L 


1.5 mm 


plasma entrance ramp profile 




half sinus 


plasma entrance ramp length 




4 Jim 


number of cells in x 




75 


number of cells in z 




860 (r = 13)-1691 (r = 1) 


cell size in x 


6x 


0.65fim 


cell size in z 


6z 


A/32 


time step 


5t 


Sit CFL limit 


particle deposition order 




cubic 


# of plasma particles/cell 




1 macro-e" + l macro-p^ 



imprint is not visible on the snapshot. Close examination reveals that the short spatial variations which make the laser 
imprint in front of the wake are transformed into time variations in the boosted frame of y = 13. 

Histories of the perpendicular and longitudinal electric fields recorded at a number of stations at fixed locations in 
the laboratory off'er direct comparison between the simulations in the laboratory frame (7=1) and boosted frames at 



7 = 2, 5, 10 and 13. Figure 11 and 12 show respectively the transverse and longitudinal electric fields collected at the 
positions z = 0.3 mm and z = 1.05 mm (in the laboratory frame) on axis (x = y = 0). The agreement is excellent and 
confirms that despite the apparent diff'erences from snapshots taken from simulations in diff'erent reference frames, 
the same physics was recovered. This is further confirmed by the plot of the average scaled beam energy gain as a 
function of position in the laboratory frame, and of relative longitudinal momentum dispersion at peak energy (Fig. 



13 ). The small diff'erences observed on the mean beam energy histories and on the longitudinal momentum spread are 
attributed to a lack of convergence at the resolution that was chosen. The beam was launched with the same phase in 
the 2-1/2D and the 3D simulations, resulting in lower energy gain in 3D, due to proportionally larger laser depletion 
eff'ects in 3D than in 2-1/2D. 

The CPU time recorded as a function of the average beam position in the laboratory frame (Fig. [T3]-middle) 
indicates that the simulation in the frame of y = 13 took 25 s in 2-1/2D and ^ 150 s in 3D versus ^ 5,000 s in 
2-1/2D and ^ 20, 000 s in 3D in the laboratory frame, demonstrating speedups of ^ 200 in 2-1/2D and ^ 130 in 3D, 
between calculations in a boosted frame at y = 13 and the laboratory frame. 

All the simulations presented so far in this section were using the Yee solver, for which the Courant condition is 

given by cSt < (l/Sx^ + 1/fe^) in 2D and cSt < (l/Sx^ + l/Sy^ + 1/fe^) in 3D where St is the time step and 
Sx, Sy and Sz are the computational grid cell sizes in x, y and z. As y was varied, the transverse resolution was kept 
constant, while the longitudinal resolution was kept at a constant fraction of the incident laser wavelength Sz = ^A, 
such that in a boosted frame, Sz* = ^A* = ^(l -\- j3) yA. As a result, the speedup becomes, when using the Yee solver 

^ feVlMx2 + l/fe2 

J yeelD = o , (/I) 

(5* VlMx2 + 1/(5Z*2 
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Figure 9: Colored surface rendering of the transverse electric field from a 2-1/2D Warp simulation of a laser wakefield acceleration stage in the 
laboratory frame (top) and a boosted frame at 7 = 13 (bottom), with the beam (white) in its early phase of acceleration. The laser and the beam are 
propagating from left to right. 



in 2D and 



> yee?>D 



(72) 



in 3D where S is given by Eq. ( 13 ). 

The speedup versus relativistic factor of the reference frame is plotted in Fig. 



T4| from ([13]), ^ and (|72]), 



and contrasted with measured speedups from ID, 2-1/2D and 3D Warp simulations, confirming the scaling obtained 
analytically. 



21 




Figure 10: Colored surface rendering of the longitudinal electric field from a 2-1/2D Warp simulation of a laser wakefield acceleration stage in the 
laboratory frame (top) and a boosted frame at 7 = 13 (bottom), with the beam (white) in its early phase of acceleration. The laser and the beam are 
propagating from left to right. 




Figure 11: History of transverse electric field at the position x = _y = 0, z = 0.3 mm and z= 1.05 mm (in the laboratory frame) from simulations in 
the laboratory frame (7=1) and boosted frames at 7 = 2, 5, 10 and 13. 
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Figure 12: History of longitudinal electric field at the position x = _y = 0, z = 0.3 mm and z = 1.05 mm (in the laboratory frame) from simulations 
in the laboratory frame (7=1) and boosted frames at y = 2, 5, 10 and 13. 
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Figure 13: (top) Average scaled beam energy gain and (middle) CPU time, versus longitudinal position in the laboratory frame from simulations; 
(bottom) distribution of relative longitudinal momentum dispersion at peak energy, in the laboratory frame (7=1) and boosted frames at 7 = 2, 5, 
10 and 13. 
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5.1.2. Effect of filtering, solver with adjustable dispersion and damping 

The modeling of full scale stages, which allow for higher values of y for the reference frame, is more prone to 
the high frequency instability that was mentioned in a previous section, as we will show below. In anticipation of the 
application of the method presented above to mitigate the instability, simulations of the scaled stage were conducted 



using the Yee solver with digital filter S( 1:2:4) as described above (Fig. 15 ), the Cole-Karkkainen solver (Fig. 16) or 



the Yee-Friedman solver (Fig. 11). 

Smoothing with the wideband filter S( 1:2:4) did not produce significant degradations for the calculation in the 
wake frame (y = 13) but did otherwise. The calculations with the Yee solver and the Cole-Karkkainen solver gave 
identical results, validating our implementation of the CK solver. Despite the more expensive stencil, the run with the 
CK solver was almost 40% faster, due to a time step larger by ^^2. Similarly to filtering, damping aggressively did 
not degrade the result in the range 10 < y < 13 but did significantly in the range 1 < y < 5. Comparing the timings 



with those of Fig. 13 (middle-left) shows that the smoothing and the damping added less than a factor of two of total 
runtime to the simulations. 
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Figure 15: ( (left) Average scaled beam energy gain and (right) CPU time, versus longitudinal position in the laboratory frame from simulations 
in the laboratory frame (y = 1) and boosted frames at y = 2, 5, 10 and 13, using the Yee solver with digital filter S(l:2:4) (grey cross is reference 
from run with filter S(l)). 




Figure 16: ( (left) Average scaled beam energy gain and (right) CPU time, versus longitudinal position in the laboratory frame from simulations in 
the laboratory frame (7=1) and boosted frames at y = 2, 5, 10 and 13, using the Cole-Karkkainen solver with filter S(l) (red curve is reference 
from calculation with Yee solver and filter S(l)). 

Those results lead to several observations: (i) while the grid dimensions and number of cells were chosen such that 
square cells were obtained for y = 13, meaning a larger dispersion in the longitudinal direction with the Yee solver 
than with the Cole-Karkkainen solver, both gave the same result. This is significant since for simulations of LPA in 
the laboratory frame reported in the literature, the need to have nearly perfect numerical dispersion in the longitudinal 
direction imposes a constraint on the cell aspect ratio and thus on resolution ||45l|46l. This constraint is removed when 
simulating in the frame of the wake (y = 13 ^ y^^); (ii) damping of high frequencies with the Yee-Friedman solver 
or wideband smoothing of short wavelength have a negligible eff'ect on accuracy for simulations in the frame of the 
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Figure 17: ( (left) Average scaled beam energy gain and (right) CPU time, versus longitudinal position in the laboratory frame from simulations in 
the laboratory frame (7=1) and boosted frames at y = 2, 5, 10 and 13, using the Yee-Friedman solver with 6=1 (grey cross is reference from run 
with no damping). 



wake, but degrade the accuracy very significantly for slower moving reference frames. The dependency of the eff'ect of 
damping and smoothing with y boost has two causes. First, simulations with a boost y ~ 7vv require fewer time steps 
than simulations using a lower value of y. Thus, for a given value of the damping coefficient 0, the integrated amount 
of damping will be lower for the simulations with y ^ yw Second, as mentioned above in the discussion of the surface 
renderings shown in Fig. [9]and[T0| a large fraction of the short wavelength content that is present in the simulations 
in the laboratory frame is transformed into time oscillations in simulations in the wake frame. Hence, filtering short 
wavelength has less eff'ect on the physics when calculating in the wake frame than when calculating in the laboratory 
frame; (iii) the cost of using even the most aggressive damping or smoothing is low, especially considering that the 
simulations presented here were using only two plasma macro-particles per cell. 

In summary, calculating in a boosted frame near the frame following the wake (y ^ y^) relaxes the constraint on 
the numerical dispersion in the direction of propagation of the laser (which is essential in simulations in the laboratory 
frame), and allows for more aggressive damping of high frequencies and smoothing of short wavelengths than is 
possible in standard laboratory frame calculations. 



5.2. Full scale 10 GeV class stages 

As noted in 1 13], full scale simulations using the laboratory frame of 10 GeV stages at plasma densities of 10^^ 
cm"-^ are not practical on present computers in 2D and 3D. At this density, the wake relativistic factor y^ ^ 132, and 
2-1/2D and 3D simulations were done in boosted frames up to y = 130. 




Figure 18: (left) Average beam energy gain versus longitudinal position (in the laboratory frame), (right) Fourier Transform of the longitudinal 
electric field at t=40 ps, averaged over whole domain, from 2D- 1/2 simulations of a full scale lOGeV LPA in a boosted frame at y = 130, using the 
Yee solver and various digital filter kernels. Square cells (Sx = Sz = 6.5//m) and the CFL time step (cSt/Sz -II V2) were used. 
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Figure 19: Average beam energy gain versus longitudinal position (in the laboratory frame) from 2D- 1/2 simulations of a full scale lOGeV LPA in 
a boosted frame at y = 30, 60 and 130, using the Yee solver. 



5.2.1. Simulations in 2-1/2D 

Fig. [18] shows the average beam energy gain versus longitudinal position and the averaged Fourier Transform of 



the longitudinal electric field taken at t=40 ps, from 2D- 1/2 simulations of a full scale lOGeV LPA in a boosted frame 



at 7 = 130, using the Yee solver and various smoothing kernels. Fig. [19] shows the average beam energy gain versus 
longitudinal position from simulations in boosted frames at y = 30, 60 and 130. All runs gave the same beam energy 
history within a few percents, and no sign of instability is detected in the Fourier transform plot of the longitudinal 



electric field. The average energy gain peaks around 8 GeV, in agreement with the scaled simulations (see Fig. 13 ). 



5.2.2. Simulations in 3D 

In 3D, all simulations at y = 130 using the Yee solver (using cubic cells and a time step at the CFL limit) developed 
the instability and loss of the beam, regardless of the amount of filtering or damping that has been tried. The failure of 
the 3D simulations using the Yee solver motivated use of the Cole-Karkkainen-Friedman (CKF) solver, with various 
levels of filtering and damping. Data from 3D simulations using the CKF solver and various smoothing kernels are 
plotted in Fig. [20] Stability is attained when using a sufficient level of filtering. Damping is detrimental to stability at 
low levels (6 - 0.1) but is beneficial at a higher level {Q - 0.5). 

Next, simulations using the solver coefficients CK2-5 from Table [T] were performed, with the time step set at their 
respective CFL limit. The best results were obtained using solvers CK2 and CK3, while CK4 and CK5 did not off'er 
substantial improvement over the CK solver. The results from the runs using CK2 and CK3 were nearly identical and 



hence only thoses from CK2 are reported in Fig. 21 which show very consistent beam energy gain histories, and no 
sign of instability in the Fourier Transform plot of the longitudinal electric field at t=40 ps (closer inspection revealed 
that when using the lowest level of filtering S(l), a mild instability was developing but it was not aff'ecting the average 
beam energy gain history). As shown on Fig. [22] the results at y = 30 - 125 are in excellent agreement while the run 
at 7 = 130 predicts a slightly lower energy gain, all within 10 percent of the maximum energy gain predicted around 
5.7 GeV by the scaled simulations shown on Fig. [T3] (top-right). 

In summary, the full scale 6-7 GeV simulations using the frame of the wake performed in this subsection show: 
(i) 2-1/2D simulations using the Yee solver at the CFL limit (with square cells) were free of instability; (ii) 3D 
simulations using the CK solver developed moderately strong instabilities that were mitigated using moderate to high 
levels of damping and/or filtering, the latter being the most eff'ective; (iii) 3D simulations using the CK2 (or CK3) 
solver developed very mild instabilities that were mitigated with a low level of filtering. 
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Figure 20: (left) Average beam energy gain versus longitudinal position (in the laboratory frame); (right) Fourier Transform of the longitudinal 
electric field at t=40 ps, averaged over plane on axis perpendicular to laser polarization, from 3D simulations of a full scale lOGeV LPA in a 
boosted frame at 7 = 130, using the Cole-Karkkainen-Friedman solver and various smoothing kernels, with (top) no numerical damping (6 = 0), 
(middle) damping with 6 = 0.1 and (bottom) ^ = 0.5. 
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Figure 21: (left) Average beam energy gain versus longitudinal position (in the laboratory frame), (right) Fourier Transform of the longitudinal 
electric field at t=40 ps, averaged over whole domain, from 3D simulations of a full scale lOGeV LPA in a boosted frame at 7 = 130, using the 
CK2 solver and various digital filter kernels. 




Figure 22: Average beam energy gain versus longitudinal position (in the laboratory frame) from 3D simulations of a full scale lOGeV LPA in a 
boosted frame at 7 = 30, 60, 120, 125 and 130, using the Yee solver (7 = 30 and 60) and the CK2 solver (7 = 120 - 130), with digital filter S(l) 
and with the time step set by cSt/Sz = 1 / V2 for stability (see discussion below) . 
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5.3. Effects of numerical parameters on the observed instability 




Figure 23: Fourier Transform of the longitudinal electric field at t=40 ps, averaged over plane on axis perpendicular to laser polarization, from 
(left) 3D and (right) 2D- 1/2 simulations of a full scale lOGeV LPA in a boosted frame at y = 130, using the Yee solver and various smoothing 
kernels. The same time step at the 3D CFL limit c6t = 6x/ ^/3 was used for both simulations. 



The Fourier transform of the longitudinal electric field averaged over the whole domain at t=40 ps, from 3D 



simulations using the Yee solver, is given in Fig. 23 (left). It is contrasted to the same data taken from 2-1/2D 



simulations (right). Both simulations used the same time step at the 3D CFL limit cSt = Sz/ V3. The similarity of the 
two plots indicates that the degradation of the numerical dispersion that resulted from going from the 2D to the 3D 
CFL limit is the cause of the failure of the 3D runs using the Yee solver. Taking advantage of this observation, we 
study in this section the instability arising in 2-1/2D simulations using a time step at the 3D CFL limit. 



5.3.1. Effects of spatial resolution 

Snapshots of the longitudinal electric field at the front of the plasma taken at ^ = 12.5 ps, and their corresponding 
Fourier transform, are given in Fig. 24 from 2-1/2D simulations using the Yee solver with the time step at the 3D 
CFL limit cSt = 6zl Three resolutions were considered: (a) = = 13yum, (b) = = ^.Sfim, and (c) 

= = 3.25yum. The amplitude of the instability is roughly inversely proportional to the resolution. For this 
configuration, the instability exhibits two primary modes at various relative levels, both at a fixed number of grid 
cells in the longitudinal direction, but at a fixed absolute length in the transverse direction. This indicates that the 
transverse part of the modes is governed by the physical geometry of the problem while the longitudinal part is 
governed by numerical resolution. 

Results from 2-1/2D simulation using the CK solver at the 3D CFL limit cSt/Sz = 1/ V3 at the resolution Sx = 
Sz = 6.5/j, m are given in Fig. 25 The same two modes that were observed in the plots from the equivalent simulation 
using the Yee solver (see Fig. 24 middle), are present, and the overall amplitude of the instability is similar. These 
similarities on the details of the instability between the Yee and CK solvers indicate that the diff'erences in numerical 
dispersion of the solvers do not constitute a key factor aff'ecting the instability. 



5. 3. 2. Effects of time step 

It is striking that all the solvers that lead to the lowest levels of instability had the same CFL time step cStcFL = 
Sz/ V2. For checking whether this is coincidental, simulations were performed using the CK solver, scanning the time 
step between cSt/Sz = 0.5 and cSt/Sz = 1. The Fourier Transform of the longitudinal field averaged over the entire 
domain taken at ^ = 40 ps, is given in Fig. 26 exhibiting a sharp reduction of the instability level in a narrow band 
around cSt = Sz/ V2. Since the numerical dispersion degrades in all directions when the time step diminishes, this 
indicates that the value of the time step value is of more importance than the numerical dispersion of the solver being 
used. 

Simulations using the Yee or the CK solver with the singular time step cSt = Sz/ ^ were performed and produced 
levels of instabilities that were much reduced (and delayed) compared to the 3D CFL time step (not shown). The 
snapshot of the electric field and its Fourier Transform taken at ^ = 49 ps are given in Fig. 27 The Fourier spectrum 
is very similar in each case, although the instability is slightly stronger with the CK solver than with the Yee solver. 
In both cases, the instability is easily removed by using the S(l:2) filter (see Fig. [28]). 
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As mentioned in the previous section, the solvers CK, CK4 and CK5, which all have a CFL time step above the 
singular time step cSt = Sz/ V2, produced significant levels of instability when run at their CFL limit. It was verified 
that using those solvers in 3D at the time step cSt = Sz/ ^ resulted in greatly reduced levels of instability. It was also 
observed that running the Yee solver using non-cubic cells, e.g. with lower resolution transversely such as 6x = 2Sz at 
y = 130, or Sx = 2.6dz at y = 50, produced the same pattern: a significant instability was present when using the CFL 
time step and was greatly reduced by using cdt - 6zl V2. Hence for the suppression of the instability, the choice of the 
solver seems to depends solely on whether its CFL condition allows stability at the special time step cdt - dzl ^ for 
a given grid cell aspect ratio, but not significantly on its numerical dispersion nor on the value of the grid cell aspect 
ratio. 



533. Effects of field gathering procedure 

The scan of time step was repeated using the 'momentum conserving' procedure 1471 , in which the field values 



are interpolated at the grid nodes before being gathered onto the particles. The result is given in Fig. 29 With the 



momentum conserving procedure, the level of instability is consistently high and independent of the time step. Since 
the numerical dispersion of the solver varies substantially with the time step, this result supports the conclusion that 
the instability may not be of numerical Cerenkov nature. The identification of the nature of the instability and the 
explanation of the singular time step cdts call for a multidimensional (no instability was observed in ID regardless of 
the field gathering method) analysis of the discretized Vlasov algorithm that was employed, which is left for future 
work. 

The results that were obtained lead to the following conclusions: (i) the time step c^^^ = fe/ V2 consistently pro- 
duces the lowest levels of instability, regardless of dimensionality (2D vs 3D), the field solver being used, resolution, 
aspect ratio of cells (within the range of the finite number of cases that were experimented); (ii) the main advantage 
of the tunable field solver resides in allowing access to the singular time step c6ts rather than providing improved 
numerical dispersion, which consequently do not appear to be a primary driver of the instability; (iii) the instability is 
not completely removed at cdts and filtering is still needed, albeit at lower levels; (iv) the field gathering procedure 
is key, as the existence of a singular time step at which the instability is greatly reduced is observed using an 'energy 
conserving' procedure, but not using a 'momentum conserving' procedure. These results indicate that the instability 
that is being observed may not be a type of numerical Cerenkov instability, as originally conjectured. 
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Figure 24: (left) Snapshot of the longitudinal electric field (£^//) at the front of the plasma at f = 12.5 ps; (right) Fourier Transform of the longitudinal 
electric field, from 2-1/2D simulations of a full scale lOGeV LPA in a boosted frame at y = 130, using the Yee solver, for (top) Sx = Sz = 13//m; 
(middle) Sx = Sz = 6.5//m; (bottom) Sx = Sz = 3.25//m. The time step at the 3D CFL limit cSt = Sz/ V3 was used for all three simulations. 



33 






-3.6 -3.4 

Z (mm) 



10 15 20 25 30 



Figure 25: (left) Snapshot of the longitudinal electric field {En) at the front of the plasma at f = 12.5 ps; (right) Fourier Transform of the longitudinal 
electric field, from 2-1/2D simulations of a full scale lOGeV LPA in a boosted frame at y = 130, with the CK solver, using 5x = 6z = 6.5//m, and 
the time step at the 3D CFL limit c6t = 6x1 ^/3. 




34 




Z (mm) 



10 15 20 25 30 
VAz 



Figure 27: (left) Snapshot of the longitudinal electric field (£//) at the front of the plasma at f = 49 ps; (right) Fourier Transform of the longitudinal 
electric field, from 2-1/2D simulations of a full scale lOGeV LPA in a boosted frame at y = 130, using 5x = 6z = 6.5/um, and the time step at the 
2D CFL limit c6t = 6z/ ^/2, for (top) the Yee solver; (bottom) the CK solver. 
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Figure 28: Snapshot of the longitudinal electric field (E//) at the front of the plasma at f = 49 ps from 2-1/2D simulations of a full scale lOGeV 
LPA in a boosted frame at y = 130, using Sx = = 6.5//m, and the time step at the 2D CFL limit cSt = Sz/ ^l2, for (left) the Yee solver; (right) the 
CK solver. The filter S(l:2) was used to remove the instability that is visible in Fig.|27| The remaining feature is the wake. 
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Figure 29: Fourier Transform of the longitudinal electric field at t=40 ps, averaged over the whole domain, from 2-1/2D simulations of a full 
scale lOGeV LPA in a boosted frame at 7 = 130, using the CK solver, for time steps between cdtjdz - 0.5 and cdtfdz - 1, using a 'momentum 
conserving' field gathering scheme. 
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5.4. Full scale 100 GeV - 1 TeV class stages 




Mean beam position (km) Mean beam position (m) 



Figure 30: Average beam energy gain versus longitudinal position (in the laboratory frame) for simulations at Ue = 10 cc down to 10^ cc, using 
frames of reference between y = 12 and y = 1300, in 2-1/2D (left) and 3D (right). 



Using the knowledge acquired from the 10 GeV class study, simulations of stages in the range of 0.1 GeV-1 TeV 
were performed in 2-1/2D and in the range of 0.1-100 GeV in 3D. The plasma density rie scales inversely to the energy 
gain, from 10^^ cc down to 10^^ cc in the 0.1 GeV-1 TeV range. These simulations used the parameters given in Table 
[2] scaled appropriately, and used the high speed of the boosted simulations to allow fast-turnaround improvement of 
the stage design |[T8|[T9l. Scaled energy gain was increased by adjusting the phase of the beam injection behind the 
laser by ~ 12% in 3D and 7% in 2D, with respect to the results presented in the preceding section. The 5% level 
difference between the 2D and 3D beam phases is likely due to small differences in wake structure, laser depletion, 
and the small number of betatron oscillations of the laser. To minimize beam loss, the beam dimensions were reduced 
by a factor of 3 in each dimension. Simulations showing performance of this design in 2-1/2D were performed using 
the Yee solver with filter S(l) for the 0.1-10 GeV runs, S(l:2) for the 100 GeV and S(l:2:3) for the 1 TeV ones. The 
3D simulations were performed using the CK2 solver with filter S(l) for the 0.1-1 GeV runs, and S(l:2) for the 10-100 
GeV ones. The average beam energy gain history is plotted in Fig. |30| scaling the 0.1-100 GeV runs to the 1 TeV 
range in 2-1/2D, and the 0.1-10 GeV runs to the 100 GeV range in 3D. The results exhibit an excellent agreement on 
the peak scaled beam energy gain between 0.1-100 GeV runs, and on the scaled beam energy gain histories between 
the 1-100 GeV runs. A higher level of smoothing was needed for the ITeV case, explaining the deviation past 1 km. 
This deviation is of little importance in practice, where one is mostly interested in the beam evolution up-to the peak 
energy point. The differences at 10^^ on the scaled beam energy gain history can be attributed to the effects from 
having only a few laser oscillations per pulse. 

Using ([13]), the speedup of the full scale 100 GeV class run, which used a boosted frame of y = 400 as frame of 
reference, is estimated to be over 100,000, as compared to a run using the laboratory frame. Assuming the use of a 
few thousands of CPUs, a simulation that would require several decades to complete using standard PIC techniques 
in the laboratory frame, was completed in four hours using 2016 CPUs of the Cray system at NERSC. With the same 
analysis, the speedup of the 2-1/2D 1 TeV stage is estimated to be over a million. 
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6. Conclusion and outlook 

The technique proposed in 1 1 1 was appHed successfully to speedup by orders of magnitude calculations of laser- 
plasma accelerators from first principles. The theoretical speedup estimate from (11 was improved, while complica- 
tions associated with the handling of input and output data between a boosted frame and the laboratory frame were 
discussed. Practical solutions were presented, including a technique for injecting the laser that is simpler and more 
efficient than methods proposed previously. 

Control of an instability that was limiting the speedup of such calculations in previous work is demonstrated, via 
the use of a field solver with tunable coefficients and digital filtering. The tunable solver was shown to be compatible 
with existing "exact" current deposition techniques for conservation of Gauss Law, and accommodates Perfectly 
Matched Layers for efficient absorption of outgoing waves. 

Extensive testing of the methods presented for numerical Cerenkov mitigation reveals that choosing the frame of 
the wake as the frame of reference allows for higher levels of filtering and damping than is possible in other frames with 
the same accuracy. It also revealed that there exists a singular time step for which the level of instability is minimal, 
independently of other numerical parameters, especially the numerical dispersion of the solver. This indicates that 
the observed instability may not be caused by numerical Cerenkov eff'ects. Analysis of the nature of the instability is 
underway, but regardless of cause, the methods presented mitigate it eff'ectively. The tunability of the field solver is 
key in providing stability in 3D at the singular time step, which is not attainable by the standard Yee solver. 

The use of those techniques permitted the first calculations in the optimal frame of 10 GeV, 100 GeV and 1 
TeV class stages, with speedups over 4, 5 and 6 orders of magnitude respectively over what would be required by 
"standard" laboratory frame calculations, which are impractical for such stages due to computational requirements. 

These results show that the technique can be applied to the modeling of 10 GeV stages, and future work will 
include the effects of beam loading, plasma density ramps, as well as particle trapping in the near future. Future work 
on the numerical methods include a comprehensive analysis of the instability and the existence of a singular time 
step under certain conditions, as well as the local application of filtering, smoothing and/or mesh refinement ISTllSSl 
around the front of the plasma, where the instability develops. The latter is expected to provide mitigation of the 
instability while preserving accuracy in the core of the simulation. 



7. Appendix I: One dimensional analysis of the CK solver 

Although the most interesting applications of the CK solver require two or three dimensions, analysis of the 



method in one dimension reveals a potential issue when cdt = Sx. In one dimension (choosing x), Equations ([43])-([44|) 
reduce to 

By\::!l2 = 5>'C+£(£.l?.i-£.ir) (73) 
= E,\'; + —{B,Cl^-ByCll^)-J- (74) 

Due to uniform time discretization and linearity, the response of the system ([73])-([74|) to arbitrary distributions and 
evolutions of sources (i.e. macro-particles) can be written as the sum of its response to the excitation from a Heaviside 
function in time, at one location in the grid. Assuming a source term of the form J\^ = Hit) where H is the Heaviside 
function, and setting the time step at the Courant limit cdt = Sx, the system (73 )-([74|) produces a spurious "odd-even" 



oscillations at the Nyquist frequency, as shown in Fig. [31] (middle-left). If a sinusoidal signal oscillating at the Nyquist 
frequency is added to the source term, the amplitude of the spurious oscillation grows linearly with time, as shown in 



Fig. 31 (middle-right). The spurious oscillation is effectively suppressed in both cases by the application of a "1-2-1" 



bilinear digital filter, as shown in Fig. 31 (bottom) . These types of filtering are of common use in Particle-In-Cell 
codes, often repeated a prescribed number of times and followed by a compensation stage to avoid excessive damping 
of long wavelengths 1 47 1 . 

The impact of the spurious oscillations and the effectiveness of the bilinear filtering at suppressing it in actual 
simulations was tested on a ID simulation of a scaled wakefield acceleration stage. The physical and numerical 
parameters of the simulation are given in table |3] Snapshots of the transverse electric field (aligned with the laser 
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Figure 31: (top) time history (in time steps) of the current source for (left) a Heaviside step (right) a heaviside step modulated by a sinusoidal 
oscillation at the Nyquist frequency; (middle) response of the system of equations f73)-f74} via a snapshot of the electric field after 10 time steps, 
without filtering of the source term; (bottom) response of the system of equations f73)-([74) with application of bilinear digital filter of the source 
term in space. A time step of cSt = Sx was used in all runs and scaled constants c = eq = 1 were assumed. 

polarization) and the plasma electron phase space, taken once the laser has propagated about half way through the 
plasma (after -20,000 time steps) are given in Fig. [32] Without filtering of the current density, an instability develops 
at the grid Nyquist frequency, severely disrupting the plasma wake, despite the fact that cubic splines were used to de- 
posit current from macro-particles to the grid and gather the electromagnetic field from the grid to the macro-particles. 
One application of the bilinear filtering (without compensation) is sufficient to suppress the spurious instability and 
produce a steady and clean wake. 



8. Appendix II: Perfectly Matched Layer 



The split form of Perfectly Matched Layer (PML) f52] framework applies readily to Eqs (43 )-(44 ). The equations 
on the component along z of the magnetic field are given by 



(A^ -h (Tj,) B,^ 

[At + O-y) B,y 
{At + (Ty)E^ 



-KEy 



-c^A,{B,,^B,y) 

C^Ay[B,,+B,y) 



(75) 
(76) 
(77) 
(78) 



where cTx and CTy are the absorbing layer coefficients along x and y respectively. The equations for the other compo- 
nents of the magnetic field and for the electric field are obtained similarly, applying the standard diff'erence operator 
on the spatial derivatives of the electric field and the enlarged diff'erence operator on the spatial derivatives of the 
magnetic field. The formula to update the fields is obtained by solving the finite-diff'erence equations or by integrating 
over one time step, giving 



B. 



'\i+l/2,j+l/2,k 



^ B T"^/^ 

^x-Dzx\i+i/2J+l/2,k 



^x-^y\i+l/2,j+l/2,k 



(79) 
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Table 3: List of parameters for scaled lOGeV class LPA stage simulation. 
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^y\ulil2,k = ^xEy\lj^^^2,k - ^^—7^^^ i^zx + B^y) tljl{%]^ (81) 

I — ^ 

^x\1+l/2j,k = ^y^ x\1+i/2, j,k + ^^^r^^y {^zx + B^y) \1tiilj,k (^^) 

^y 

where ^ = (1 - crSt/2) / (1 + crSt/2) via direct solve, or ^ = e~^^^ via time integration (note that in our tests, both 
implementations gave nearly identical results). 

The PML using the stencil given by ([82]) was tested and compared to the standard Yee implementation in 2D and 
3D. Fig. [33] snapshots from 2D simulations of the reflected residue from a PML layer of a pulse with amplitude given 
by the Harris function (10 - 15 cos(2nct/L) + 6 ^ cos(47ict/L) - cos(67ict/L))/ 32 where t is time, c is the speed 
of light and L = 50Sx is the pulse length in cell size units. A grid of 400x400 cells was used with Sx = Sy. The 
absorbing layer was 8 cells deep and the dependency of the PML coefficients with the index position / in the layer 
was (Ti = (Tm (iSx/Ay with cr^ = 4/Sx, A = 5Sx and n = 2. The alternative prescription for the coefficients given in 
(53JI54I, which reads cr* = (^/+i/2 - 1/^/) /Sx with ^/ = e~^'^^ and ct/ = cr^^ (iSx/AY, was also tested. 

For the generic test case that has been considered, the new implementation exhibited a very low residue of re- 
flections from the PML layer, which are qualitatively and quantitatively very similar to the residue obtained with a 
standard PML implementation. In agreement with results from 1531 [54l , the use of the modified coefficients cr* led to 
an order of magnitude improvement over the use of the standard coeflScients. 

The 3D tests gave similar absorption efficiency between the Yee and the new solver implementations of the PML, 
for all the CK solver coefficients given in Table [T] 

It was shown in |53]E3l that the efficiency of the layer can be improved further for the standard PML by augment- 
ing the equations with additional terms. However, a similar extension may not be readily available when using the 
Cole-Karkkainen stencil and is not considered here. 
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Figure 32: Snapshots from transverse electric field (normalized to maximum laser amplitude Eq) and plasma electrons longitudinal phase space 
projection, from a ID simulation of a laser wakefield acceleration stage the CFL limits (cSt = 6x) with (left) no filtering of current density; (right) 
application of a bilinear digital filter to the current density. 
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